Selectional preference acquisition through matrix factorization with missing data

Author

  • Taesun Moon
Abstract

Words in an utterance are not placed in their respective slots randomly from a uniform distribution. In English, for example, a verb will rarely, if ever, follow a determiner. This is a syntactic restriction. From another perspective, one would not expect to find a word such as defenestration as the object of eat. This is what is known as the selectional preference of a word for another word in terms of its semantic domain. When this selectional preference is restricted to the arguments that a word may take, it is called its subcategorization frame [16]. It is closely related to the problem of semantic disambiguation [24, 19, 20], whose central focus is the polysemy of words, but the definition of the problem admits more syntactic cues in that the disambiguation occurs strictly through arguments and their heads. In this paper, we lay out a novel, minimally supervised approach to inducing verb/object subcategorization frames using alternating minimization [27, 10, 4], a matrix factorization method commonly used when data is assumed to be missing. While missing data is generally not assumed in tasks dealing with semantic disambiguation or selectional preference acquisition, the pseudo-disambiguation method used to evaluate the task [18, 22, 5] is based on the assumption that data is missing. The pseudo-disambiguation task is a convenient but less rigorous alternative to human judgment for unsupervised and minimally supervised approaches to inducing semantics. It is also the only such automated evaluation method for minimally supervised tasks involving semantics. While the approach at its core is based on minimizing a loss function:
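The loss itself is not given in the text above. As a rough illustration of alternating minimization with missing data, the sketch below assumes a squared-error loss over observed cells only, a small ridge penalty, and a rank-1 model; all three are assumptions, not the paper's stated formulation. The names `als_rank1`, `observed_loss`, and the toy verb/object matrix `counts` are hypothetical.

```python
def als_rank1(matrix, n_iters=200, reg=1e-3):
    """Fit matrix[i][j] ~ u[i] * v[j] using only observed (non-None) cells.

    Assumed objective (not taken from the paper): squared error over the
    observed cells plus a ridge penalty `reg` on each factor entry.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iters):
        # Hold v fixed: each u[i] has a closed-form ridge least-squares update.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if matrix[i][j] is not None]
            num = sum(matrix[i][j] * v[j] for j in obs)
            den = sum(v[j] ** 2 for j in obs)
            u[i] = num / (den + reg)
        # Hold u fixed: update each v[j] the same way.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if matrix[i][j] is not None]
            num = sum(matrix[i][j] * u[i] for i in obs)
            den = sum(u[i] ** 2 for i in obs)
            v[j] = num / (den + reg)
    return u, v


def observed_loss(matrix, u, v):
    """Sum of squared residuals over the observed cells only."""
    return sum(
        (matrix[i][j] - u[i] * v[j]) ** 2
        for i in range(len(matrix))
        for j in range(len(matrix[0]))
        if matrix[i][j] is not None
    )


# Toy verb-by-object score matrix; None marks a cell treated as missing.
counts = [
    [2.0, 4.0, None],
    [1.0, None, 3.0],
    [None, 6.0, 9.0],
]
u, v = als_rank1(counts)
# Estimated score for the unobserved pair (verb 0, object 2).
predicted = u[0] * v[2]
```

Because missing cells are skipped in both updates rather than treated as zeros, the fitted factors can assign a nonzero score to a verb/object pair that was never observed, which is the property a pseudo-disambiguation style evaluation relies on.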

Similar resources

A non-negative tensor factorization model for selectional preference induction

Distributional similarity methods have proven to be a valuable tool for the induction of semantic similarity. Up till now, most algorithms use two-way co-occurrence data to compute the meaning of words. Co-occurrence frequencies, however, need not be pairwise. One can easily imagine situations where it is desirable to investigate co-occurrence frequencies of three modes and beyond. This paper wi...

Selectional preference acquisition through sparse principal component analysis

Words in an utterance are not placed in their respective slots randomly from a uniform distribution. In English, for example, a verb will rarely, if ever, follow a determiner. This is a syntactic restriction. From another perspective, one would not expect to find a word such as defenestration as the object of eat. This is what is known as the selectional preference of a word for another word in...

Word Sense Disambiguation For Acquisition Of Selectional Preferences

The selectional preferences of verbal predicates are an important component of lexical information useful for a number of NLP tasks including disambiguation of word senses. Approaches to selectional preference acquisition without word sense disambiguation are reported to be prone to errors arising from erroneous word senses. Large scale automatic semantic tagging of texts in sufficient quantit...

Detecting Compositionality of Verb-Object Combinations using Selectional Preferences

In this paper we explore the use of selectional preferences for detecting noncompositional verb-object combinations. To characterise the arguments in a given grammatical relationship we experiment with three models of selectional preference. Two use WordNet and one uses the entries from a distributional thesaurus as classes for representation. In previous work on selectional preference acquisit...

A Neural Network Approach to Selectional Preference Acquisition

This paper investigates the use of neural networks for the acquisition of selectional preferences. Inspired by recent advances of neural network models for NLP applications, we propose a neural network model that learns to discriminate between felicitous and infelicitous arguments for a particular predicate. The model is entirely unsupervised – preferences are learned from unannotated corpus da...

Publication date: 2008